feat(execution/grpc): adding support for grpc otlp#3300
Conversation
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughAdds OpenTelemetry propagation via inbound and outbound Connect interceptors; the client and handler now apply these interceptors, OTEL modules are added to go.mod, and tests verify traceparent and baggage propagation plus end-to-end trace identity preservation. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant OutIC as Outbound\nInterceptor
participant Prop as OTEL\nPropagator
participant GRPCClient as Connect\nClient
participant InIC as Inbound\nInterceptor
participant Server as Server\nHandler
Client->>OutIC: Call with context (may have baggage/trace)
OutIC->>Prop: Retrieve global text-map propagator
Prop-->>OutIC: Propagator
OutIC->>OutIC: Inject headers into request
OutIC->>GRPCClient: Send request with headers
GRPCClient->>InIC: Deliver request + headers
InIC->>Prop: Retrieve global text-map propagator
Prop-->>InIC: Propagator
InIC->>InIC: Extract headers into context
InIC->>Server: Forward request with extracted context
Server-->>InIC: Response
InIC-->>GRPCClient: Return response
GRPCClient-->>OutIC: Response
OutIC-->>Client: Final response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 57 minutes and 11 seconds.Comment |
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3300 +/- ##
==========================================
+ Coverage 60.52% 60.72% +0.20%
==========================================
Files 126 127 +1
Lines 13721 13735 +14
==========================================
+ Hits 8304 8340 +36
+ Misses 4513 4491 -22
Partials 904 904
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@execution/grpc/otel_propagation_test.go`:
- Around line 156-157: The test currently writes to shared variable midSpan from
the handler goroutine and reads it from the test goroutine with a time.Sleep,
causing a race; change this to a channel-based handoff: create a chan trace.Span
or chan trace.SpanContext (or chan string for traceID) in the test, have the
handler (the goroutine that currently assigns midSpan) send the span (or
span.Context()/TraceID) on that channel, and have the test goroutine receive
from the channel before asserting (replace uses of midSpan with the received
value); update references to midSpan in the handler and assertions to use the
received value to eliminate the race and remove the timing-dependent sleep.
- Around line 83-89: The test has data races because gotTraceparent, gotBaggage,
and midSpan are written from interceptor/handler goroutines and read from the
test goroutine; replace the unsynchronized string/bool accesses with a sync
primitive (e.g., a single sync.Mutex protecting gotTraceparent and gotBaggage
and a channel or sync.Cond to signal midSpan readiness) so writes in
captureHeader (UnaryInterceptorFunc) and the handler acquire the mutex before
setting values and signal the channel after setting midSpan, and the test
goroutine waits on that channel (or locks the mutex) instead of sleeping; update
all locations referencing gotTraceparent, gotBaggage, and midSpan to use the
chosen synchronization to eliminate race detector warnings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: acca9aea-2543-4f34-861a-aff1dae1335d
⛔ Files ignored due to path filters (1)
execution/grpc/go.sumis excluded by!**/*.sum
📒 Files selected for processing (5)
execution/grpc/client.goexecution/grpc/go.modexecution/grpc/handler.goexecution/grpc/otel_propagation.goexecution/grpc/otel_propagation_test.go
| gotTraceparent := "" | ||
| captureHeader := connect.UnaryInterceptorFunc(func(next connect.UnaryFunc) connect.UnaryFunc { | ||
| return func(ctx context.Context, req connect.AnyRequest) (connect.AnyResponse, error) { | ||
| gotTraceparent = req.Header().Get("traceparent") | ||
| return next(ctx, req) | ||
| } | ||
| }) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -t f -i "otel_propagation_test.go" | head -20Repository: evstack/ev-node
Length of output: 99
🏁 Script executed:
cat -n execution/grpc/otel_propagation_test.go | head -200Repository: evstack/ev-node
Length of output: 7290
🏁 Script executed:
# Verify no explicit sync primitives protect these variables
rg -A 10 "var (gotTraceparent|gotBaggage|midSpan)" execution/grpc/otel_propagation_test.goRepository: evstack/ev-node
Length of output: 1021
🏁 Script executed:
# Check if there are any sync primitives used elsewhere in the test file
rg "(sync\.|Mutex|WaitGroup|Channel)" execution/grpc/otel_propagation_test.goRepository: evstack/ev-node
Length of output: 41
🏁 Script executed:
# Confirm that GetTxs and GetExecutionInfo are blocking calls that wait for completion
grep -A 5 "func.*GetTxs\|func.*GetExecutionInfo" execution/grpc/otel_propagation_test.go | head -20Repository: evstack/ev-node
Length of output: 41
🏁 Script executed:
# Check if httptest.Server spawns goroutines (standard library behavior verification)
rg "httptest.NewServer" -B 2 -A 2 execution/grpc/otel_propagation_test.goRepository: evstack/ev-node
Length of output: 865
🏁 Script executed:
# Check how client.GetTxs() is defined to confirm it's a blocking call
fd -t f "client.go" execution/grpc/ | head -5Repository: evstack/ev-node
Length of output: 84
🏁 Script executed:
# Look for the client implementation
rg "func.*NewClient" -A 10 execution/grpc/ | head -30Repository: evstack/ev-node
Length of output: 730
🏁 Script executed:
# Verify the blocking behavior by checking the generated client code signature
cat -n execution/grpc/client.go | grep -A 20 "func.*GetTxs"Repository: evstack/ev-node
Length of output: 1082
🏁 Script executed:
# Confirm the blocking nature - check if there's an error return that would indicate completion
rg "client.GetTxs|client.GetExecutionInfo" execution/grpc/otel_propagation_test.go | grep -v "downstreamClient\|upstreamClient"Repository: evstack/ev-node
Length of output: 240
Synchronize header capture and span access to avoid test data races.
gotTraceparent, gotBaggage, and midSpan are written in handler/interceptor goroutines and read in the test goroutine without synchronization. Although the blocking RPC calls provide implicit ordering, the Go race detector requires explicit synchronization primitives (mutex or channels) to recognize the happens-before relationship. Additionally, the fixed 10ms sleep at line 189 violates the determinism guideline; use explicit synchronization instead of timing-based waits.
Suggested fix (mutex-guarded capture)
import (
"context"
"net/http/httptest"
+ "sync"
"testing"
"time"
@@
- gotTraceparent := ""
+ var mu sync.Mutex
+ gotTraceparent := ""
captureHeader := connect.UnaryInterceptorFunc(func(next connect.UnaryFunc) connect.UnaryFunc {
return func(ctx context.Context, req connect.AnyRequest) (connect.AnyResponse, error) {
+ mu.Lock()
gotTraceparent = req.Header().Get("traceparent")
+ mu.Unlock()
return next(ctx, req)
}
})
@@
- if gotTraceparent == "" {
+ mu.Lock()
+ traceparent := gotTraceparent
+ mu.Unlock()
+ if traceparent == "" {
t.Fatalf("expected traceparent metadata to be propagated")
}Apply the same pattern to gotBaggage (lines 123-131) and midSpan (lines 156-193). For the end-to-end test, use a channel or sync primitive to wait for the handler to populate midSpan instead of the fixed sleep.
Also applies to: 97-103, 123-131, 139-148, 156-193
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@execution/grpc/otel_propagation_test.go` around lines 83 - 89, The test has
data races because gotTraceparent, gotBaggage, and midSpan are written from
interceptor/handler goroutines and read from the test goroutine; replace the
unsynchronized string/bool accesses with a sync primitive (e.g., a single
sync.Mutex protecting gotTraceparent and gotBaggage and a channel or sync.Cond
to signal midSpan readiness) so writes in captureHeader (UnaryInterceptorFunc)
and the handler acquire the mutex before setting values and signal the channel
after setting midSpan, and the test goroutine waits on that channel (or locks
the mutex) instead of sleeping; update all locations referencing gotTraceparent,
gotBaggage, and midSpan to use the chosen synchronization to eliminate race
detector warnings.
| var midSpan trace.Span | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n execution/grpc/otel_propagation_test.go | head -200Repository: evstack/ev-node
Length of output: 7290
🏁 Script executed:
tail -n +195 execution/grpc/otel_propagation_test.go | head -50Repository: evstack/ev-node
Length of output: 332
Fix concurrent access to midSpan without race condition or timing-dependent waits.
midSpan is written in the handler goroutine (line 170) and read in the test goroutine (line 192) without synchronization. The time.Sleep(10ms) is brittle and timing-dependent, violating the determinism requirement for tests. Use channel-based synchronization to pass the trace ID explicitly instead.
Suggested fix (channel-based synchronization)
- var midSpan trace.Span
+ midTraceIDCh := make(chan trace.TraceID, 1)
@@
upstreamExec := &mockExecutor{getTxsFunc: func(ctx context.Context) ([][]byte, error) {
ctx, span := tracer.Start(ctx, "upstream-mid")
- midSpan = span
+ midTraceIDCh <- span.SpanContext().TraceID()
defer span.End()
@@
- time.Sleep(10 * time.Millisecond)
-
rootTraceID := root.SpanContext().TraceID()
- if midSpan.SpanContext().TraceID() != rootTraceID {
+ var midTraceID trace.TraceID
+ select {
+ case midTraceID = <-midTraceIDCh:
+ case <-time.After(time.Second):
+ t.Fatalf("timeout waiting for upstream-mid span")
+ }
+ if midTraceID != rootTraceID {
t.Fatalf("mid span trace id mismatch")
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@execution/grpc/otel_propagation_test.go` around lines 156 - 157, The test
currently writes to shared variable midSpan from the handler goroutine and reads
it from the test goroutine with a time.Sleep, causing a race; change this to a
channel-based handoff: create a chan trace.Span or chan trace.SpanContext (or
chan string for traceID) in the test, have the handler (the goroutine that
currently assigns midSpan) send the span (or span.Context()/TraceID) on that
channel, and have the test goroutine receive from the channel before asserting
(replace uses of midSpan with the received value); update references to midSpan
in the handler and assertions to use the received value to eliminate the race
and remove the timing-dependent sleep.
# Conflicts: # execution/grpc/client.go

Overview
When OLTP wiring was set up, it was only wired for HTTP, this adds the grpc interceptor
Summary by CodeRabbit
New Features
Tests
Chores